Identifying Causal Effects with Computer Algebra 
Abstract



The long-standing identification problem for 
causal effects in graphical models has many 
partial results but lacks a systematic study. 
We show how computer algebra can be used 
to either prove that a causal effect can be 
identified, generically identified, or show that 
the effect is not generically identifiable. We 
report on the results of our computations for 
linear structural equation models, where we 
determine precisely which causal effects are 
generically identifiable for all graphs on three 
and four vertices. 



1 INTRODUCTION 

Consider a parametric statistical model $p_\bullet : \Theta \to P(X)$
that associates to a parameter vector $\theta \in \Theta$
a probability density $p_\theta(x)$ on the sample space $X$.
Let $s : \Theta \to \mathbb{R}$ be a parameter of interest. The
identification problem asks: Does there exist a function $\Phi$
from the model $p_\Theta = \{p_\theta : \theta \in \Theta\}$ to $\mathbb{R}$ such that
$\Phi \circ p_\theta = s(\theta)$ for all $\theta \in \Theta$? If there is such a function,
then the parameter $s(\theta)$ is said to be identifiable (and
$\Phi$ is the identification formula), and if there is no such
function the parameter is not identifiable. The main
focus of this paper is on generically identifiable parameters
(i.e. almost everywhere identifiable), which are
identifiable except possibly on a set of measure zero.

We will describe a general framework using computer
algebra for addressing such questions. Our perspective
is that, in most problems of interest in machine
learning, both the map $p_\bullet$ that associates a density
to a parameter vector, and the parameter of interest
$s(\theta)$, are polynomial (or rational) functions of the
parameters $\theta$. In this case, if the parameter is (generically)
identifiable, the identification formula must be,
at worst, an algebraic function of the probability distribution,
and such an algebraic identification formula
can be detected or proven not to exist using Gröbner
basis computations. Some past work on using computer
algebra for the identifiability problem includes
(Geiger and Meek 1999) and (Merckens et al. 1994).



Our motivation for laying down this general frame- 
work is to provide a systematic study of the identifi- 
ability of direct and indirect causal effects in causal 
graphical models. We provide a systematic study of 
the generic identifiability of linear structural equation 
models (SEMs) using our general framework in Section 
4. This problem has been much studied in the litera- 
ture of machine learning, graphical models, statistics, 
econometrics, etc. There exist many different graphical
criteria that guarantee that some particular causal
effects can be identified or generically identified,
including the "single-door" and "instrumental variables"
criteria for direct effects; the "back-door" criterion
for total effects (see Pearl 2000); the "G-criterion"
(Brito and Pearl, 2006); and the various criteria introduced
by Tian (2004, 2005, 2009). Other references among
many include: (Fisher 1966), (Kuroki and Miyakawa
1999), (Robins 1987), and (Simon 1953).

Tian and Pearl (2002) gave an algorithm, proven complete
by Shpitser and Pearl (2006) and Huang and
Valtorta (2006), for identification of parameters in
nonparametric structural equation models (that is, an
algorithm which decides identifiability depending only
on the type of graph). While there are many conditions
that exist for generic identifiability, there is
no known necessary and sufficient condition to decide
generic identifiability in either the general nonparametric
case, or in specific situations (e.g. linear SEMs).
One goal in this paper is to provide a systematic study 
to classify the linear SEMs on small numbers of vari- 
ables whose parameters are generically identifiable. 

In the next section, we describe the general algebraic
framework for performing identifiability computations.
In Section 3, we describe the problem for Gaussian
structural equation models, and give examples of code
that show how to perform the computations from Section 2
for these models. In Section 4 we report on the
results of our computations.

2 COMPUTER ALGEBRA FOR 
PARAMETER IDENTIFICATION 

In this section, we describe the general framework for
addressing identifiability problems using computer algebra.
We refer the reader to (Cox et al. 2007) for
background on computer algebra, ideals, and related
topics which are used in this section.

Let $\Theta \subseteq \mathbb{R}^d$ be a full dimensional parameter set. In
most applications $\Theta$ is a convex subset of $\mathbb{R}^d$. Let
$\mathbb{R}[t] := \mathbb{R}[t_1, \ldots, t_d]$ denote the set of all polynomials
in the indeterminates (i.e. polynomial variables)
$t_1, t_2, \ldots, t_d$ with real coefficients. The set $\mathbb{R}[t]$ is
called the polynomial ring, a ring being an algebraic
structure with compatible addition and multiplication
operations. Let $f_1, \ldots, f_n \in \mathbb{R}[t]$ be polynomials.
These polynomials define a function $f : \Theta \to \mathbb{R}^n$ by
$f(\theta) = (f_1(\theta), \ldots, f_n(\theta))^T$. The image of $f$ is the set
$f(\Theta) := \{f(\theta) : \theta \in \Theta\}$.

We define a parameter to be a polynomial function
$s : \Theta \to \mathbb{R}$ which is not constant on $\Theta$. The parameter $s$ is
identifiable if there exists a map $\Phi : \mathbb{R}^n \to \mathbb{R}$ such that
$s(\theta) = \Phi \circ f(\theta)$ for all $\theta \in \Theta$. Note the use of the word
map, rather than function: we do not require that $\Phi$
be defined on all of $\mathbb{R}^n$, but only on $f(\Theta)$. This leads
to our next definition. The parameter $s$ is generically
identifiable if there exists a map $\Phi : \mathbb{R}^n \to \mathbb{R}$ and a
dense open subset $U$ of $\Theta$ such that $s(\theta) = \Phi \circ f(\theta)$ for
all $\theta \in U$. Generic identifiability is also called almost
everywhere identifiability in the literature.

An important special case concerns the coordinate
functions $s(\theta) = \theta_i$. If all these functions are (generically)
identifiable there exists a (generic) inverse map
to the function $f$, in which case every parameter is
(generically) identifiable.

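Example 1. Let $\theta = (\omega_{11}, \omega_{22}, \omega_{23}, \omega_{33}, \lambda_{12}, \lambda_{23})$, and
let

$$\Theta = \{(\omega_{11}, \omega_{22}, \omega_{23}, \omega_{33}, \lambda_{12}, \lambda_{23}) \in \mathbb{R}^6 :
\omega_{11} > 0,\ \omega_{22} > 0,\ \omega_{33} > 0,\ \omega_{22}\omega_{33} > \omega_{23}^2\}.$$

Let $f : \mathbb{R}^6 \to \mathbb{R}^6$ be given by

$$\begin{aligned}
f_{11}(\theta) &= \omega_{11} \\
f_{12}(\theta) &= \omega_{11}\lambda_{12} \\
f_{13}(\theta) &= \omega_{11}\lambda_{12}\lambda_{23} \\
f_{22}(\theta) &= \omega_{22} + \omega_{11}\lambda_{12}^2 \\
f_{23}(\theta) &= \omega_{22}\lambda_{23} + \omega_{11}\lambda_{12}^2\lambda_{23} + \omega_{23} \\
f_{33}(\theta) &= \omega_{33} + \omega_{22}\lambda_{23}^2 + 2\omega_{23}\lambda_{23} + \omega_{11}\lambda_{12}^2\lambda_{23}^2.
\end{aligned}$$

Since $\omega_{11} > 0$ in $\Theta$, $\lambda_{12}$ is identifiable by the formula
$\lambda_{12} = f_{12}/f_{11}$. On the other hand, the parameter $\lambda_{23}$
is generically identifiable by the formula $\lambda_{23} = f_{13}/f_{12}$.
But this formula does not show that $\lambda_{23}$ is identifiable,
because $f_{12} = 0$ for some values in $f(\Theta)$, in particular,
whenever $\lambda_{12} = 0$. In fact, it is not difficult to show
that $\lambda_{23}$ cannot be determined whenever $\lambda_{12} = 0$, and
hence this parameter is not identifiable.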
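
The claims of Example 1 can be checked numerically. The following is a minimal sketch in Python with exact rational arithmetic, not part of the original computations; the function name `f` and the specific parameter values are assumptions chosen for illustration, and $f_{33}$ is written as the variance of $X_3$ in the model of Section 3.

```python
from fractions import Fraction as Fr

def f(w11, w22, w23, w33, l12, l23):
    """Return (f11, f12, f13, f22, f23, f33) for the map of Example 1."""
    return (
        w11,
        w11 * l12,
        w11 * l12 * l23,
        w22 + w11 * l12**2,
        w22 * l23 + w11 * l12**2 * l23 + w23,
        # Var(X3); the cross term is 2*Cov(l23*X2, eps3) = 2*w23*l23
        w33 + w22 * l23**2 + 2 * w23 * l23 + w11 * l12**2 * l23**2,
    )

# Generic point (l12 != 0): the formula f13/f12 recovers l23.
theta = (Fr(2), Fr(1), Fr(1, 3), Fr(1), Fr(3, 2), Fr(-5, 7))
p = f(*theta)
recovered_l23 = p[2] / p[1]

# Two admissible points with l12 = 0 and different l23 (hypothetical values)
# that have the same image, so l23 cannot be read off from f there.
theta_a = (Fr(1), Fr(1), Fr(1, 2), Fr(1), Fr(0), Fr(0))
theta_b = (Fr(1), Fr(1), Fr(0), Fr(3, 4), Fr(0), Fr(1, 2))
same_image = f(*theta_a) == f(*theta_b)
```

Both test points satisfy the inequalities defining $\Theta$, so the second check is a concrete witness that $\lambda_{23}$ is only generically identifiable.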

To describe the setup for determining identifiability 
with computational algebra, we need to introduce the 
closely related notion of constraints. 

Let $\mathbb{R}[p] := \mathbb{R}[p_1, \ldots, p_n]$ be the polynomial ring in
indeterminates $p_1, \ldots, p_n$. The vanishing ideal (or
constraint set) of $S \subseteq \mathbb{R}^n$ is the set

$$I(S) := \{g \in \mathbb{R}[p] : g(a) = 0 \text{ for all } a \in S\}.$$

The vanishing ideal, as the name implies, is an ideal in
the ring $\mathbb{R}[p]$, that is, it is closed under addition and
under multiplication by an arbitrary polynomial. The
simplest example of an ideal is the ideal generated by
a collection of polynomials:

$$\langle g_1, \ldots, g_r \rangle := \Big\{ \sum_{i=1}^{r} h_i g_i : h_i \in \mathbb{R}[p] \Big\}.$$

Hilbert's basis theorem says that every ideal has a finite
generating set; that is, every ideal can be written
as the ideal generated by a finite collection of polynomials.
One of the advantages of working in the language
of ideals and generating sets is that they allow
for the computation of constraint sets.
Proposition 2. Let $f : \Theta \subseteq \mathbb{R}^d \to \mathbb{R}^n$ be a polynomial
parametrization from a full dimensional parameter
space. Then

$$I(f(\Theta)) = \langle p_1 - f_1(t), \ldots, p_n - f_n(t) \rangle \cap \mathbb{R}[p].$$

In particular, the constraints can be determined by
eliminating the $t$-indeterminates.

The intersection in Proposition 2 can be computed using
Gröbner bases with elimination term orders, see
below.
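
As a small worked instance of Proposition 2, consider the toy parametrization $f(t_1, t_2) = (t_1,\ t_1 t_2,\ t_1 t_2^2)$ on $\Theta = \mathbb{R}^2$. This map is an assumption chosen here for illustration and is not one of the models studied in this paper. Eliminating $t_1, t_2$ leaves a single constraint:

```latex
\begin{align*}
J &= \langle\, p_1 - t_1,\; p_2 - t_1 t_2,\; p_3 - t_1 t_2^2 \,\rangle
     \subseteq \mathbb{R}[t_1, t_2, p_1, p_2, p_3], \\
p_1 p_3 - p_2^2 &\equiv t_1 \cdot t_1 t_2^2 - (t_1 t_2)^2 = 0 \pmod{J}, \\
I(f(\Theta)) &= J \cap \mathbb{R}[p] = \langle\, p_1 p_3 - p_2^2 \,\rangle .
\end{align*}
```

The second line shows that $p_1 p_3 - p_2^2$ lies in $J$ and involves no $t$-indeterminates; since this polynomial is irreducible and the image of $f$ is two-dimensional, it generates the whole vanishing ideal.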

Constraint sets can also be used to determine
whether or not parameters are identifiable. Indeed,
consider the modified parametrization map
$\tilde f = (s, f_1, \ldots, f_n)^T : \Theta \to \mathbb{R}^{n+1}$. Let $\mathbb{R}[q, p]$ be the
polynomial ring with one extra indeterminate $q$ corresponding
to the parameter function $s$. Let $I(\tilde f(\Theta))$ be the
vanishing ideal of the image. Then we have the following
proposition.

Proposition 3. Suppose that $g(q, p) \in I(\tilde f(\Theta))$ is
a polynomial such that $q$ appears in this polynomial,
$g(q, p) = \sum_{i=0}^{d} g_i(p) q^i$, and $g_d(p)$ does not belong to
$I(\tilde f(\Theta))$.

1. If $g$ is linear in $q$, $g = g_1(p) q - g_0(p)$, then $s$ is
generically identifiable by the formula $s = g_0(f)/g_1(f)$.
If, in addition, $g_1(p) \neq 0$ for $p \in f(\Theta)$ then $s$ is
identifiable.

2. If $g$ has higher degree $d$ in $q$, then $s$ may or may
not be generically identifiable. Generically, there
are at most $d$ possible choices for the parameter
$s(\theta)$ given $f(\theta)$.

3. If no such polynomial $g$ exists then the parameter
$s$ is not generically identifiable.



Proof. The ideal $I(\tilde f(\Theta))$ consists of all polynomials
$g(q, p)$ in $q, p_1, \ldots, p_n$ such that $g(s(\theta), f(\theta)) = 0$ for
all $\theta \in \Theta$. Suppose that there exists a polynomial $g$
satisfying the conditions of the proposition. Since $g_d(p)$
does not belong to $I(\tilde f(\Theta))$ there exists a dense open
subset $U$ of $\Theta$ on which $g_d(f(\theta))$ is nonzero.
Letting $\theta \in U$, we see that $s(\theta)$ is one of the solutions
to the nondegenerate equation $g(q, f(\theta)) = 0$. If $g$ is
linear in $q$ this equation has a unique solution, and
hence $s(\theta)$ is generically identifiable. If $g_1(p) \neq 0$ for
all $p \in f(\Theta)$ then we can take $U = \Theta$ and $s(\theta)$ is
identifiable.

If $g$ is the lowest degree polynomial in $I(\tilde f(\Theta))$ satisfying
the desired properties, and $g$ is not linear, then
$s(\theta)$ will be one of the $d$ complex solutions to the equation
$g(q, f(\theta)) = 0$. This may or may not (generically)
identify $s(\theta)$, depending on additional constraints on
$\Theta$. For instance, if $g(q, p)$ has the form

$$g(q, p) = g_2(p) q^2 - g_0(p)$$

and we know that $s(\theta) > 0$, then $s(\theta)$ will be generically
identified. On the other hand, we will see an
example in the next section due to Brito, where $d = 2$
and the parameter is not identified.

Conversely, suppose that $s(\theta)$ is generically
identifiable on $U \subseteq \Theta$. Let $p \in f(U)$. Let $I \subseteq \mathbb{R}[q]$ be
the ideal generated by the evaluations of all polynomials
$g \in I(\tilde f(\Theta))$ at the point $p$. Since $\mathbb{R}[q]$ is a principal
ideal domain, the ideal $I$ has a single generator. If this
generator is the zero polynomial, then a priori, every value
of $s(\theta)$ in $s(\Theta)$ is compatible with $p$, but this contradicts
identifiability because $s$ is not constant, so $s(\Theta)$ has
more than one point. This implies that $I$ contains a
nonzero polynomial. This implies that $I(\tilde f(\Theta))$ must
have contained a polynomial $g$ with nonzero degree $d$
in $q$ with leading coefficient $g_d(p) \notin I(\tilde f(\Theta))$; for if the
leading coefficient of every polynomial in $I(\tilde f(\Theta))$ is
in $I(\tilde f(\Theta))$ then every coefficient of every polynomial
in $I(\tilde f(\Theta))$ is in $I(\tilde f(\Theta))$. This implies that $I = \langle 0 \rangle$,
which is a contradiction. □



If the integer $d > 0$ is the lowest nonzero degree in $q$
of any polynomial in $I(\tilde f(\Theta))$, then there are $d$ complex
values for the parameter $s(\theta)$ that are compatible
with $f(\theta)$. We call such a parameter algebraically
$d$-identifiable. As mentioned in Proposition 3, a parameter
that is algebraically $d$-identifiable may or may
not be identifiable. A parameter is $d$-identifiable if
there are $d$ different $\theta' \in \Theta$ such that $f(\theta') = f(\theta)$
and the $s(\theta')$ are all distinct. See (Allman et al. 2009)
for an example of a model in phylogenetics that is
algebraically 12-identifiable but is (conjecturally) only
8-identifiable, at worst.

The existence or nonexistence of a polynomial $g$ satisfying
the conditions of Proposition 3 can be decided by
a Gröbner basis computation, which we now explain.
Basic information about Gröbner bases can be found
in (Cox, Little and O'Shea 2007).



A term order $\prec$ on the polynomial ring $\mathbb{R}[p]$ is a total
ordering on the monomials in $\mathbb{R}[p]$ that is compatible
with multiplication and such that 1 is the smallest
monomial; that is, $1 = p^0 \preceq p^u$ for all $u \in \mathbb{N}^n$ and
if $p^u \preceq p^v$ then $p^w \cdot p^u \preceq p^w \cdot p^v$. Since $\prec$ is a
total ordering, every polynomial $g \in \mathbb{R}[p]$ has a well-defined
largest monomial. Let $\mathrm{in}_\prec(g)$ be the largest
monomial appearing in $g$. For an ideal $I \subseteq \mathbb{R}[p]$ let
$\mathrm{in}_\prec(I) = \langle \mathrm{in}_\prec(g) : g \in I \rangle$. This is called the initial
ideal of $I$. A finite subset $\mathcal{G} \subseteq I$ is called a
Gröbner basis for $I$ with respect to the term order
$\prec$ if $\mathrm{in}_\prec(I) = \langle \mathrm{in}_\prec(g) : g \in \mathcal{G} \rangle$. The Gröbner basis
is called reduced if the coefficient of $\mathrm{in}_\prec(g)$ in $g$ is
one for all $g \in \mathcal{G}$, each $\mathrm{in}_\prec(g)$ is a minimal generator of
$\mathrm{in}_\prec(I)$, and no terms besides the initial terms of $\mathcal{G}$ belong
to $\mathrm{in}_\prec(I)$. For a fixed ideal $I$ and term order $\prec$,
the reduced Gröbner basis of $I$ with respect to $\prec$ is
uniquely determined. Note, however, that as the term
order varies, the reduced Gröbner basis of $I$ will also
change.

Among the most important term orders is the lexicographic
term order, which can be defined for any permutation
of the variables. In the lexicographic term
order we declare $p^u \prec p^v$ if and only if the leftmost
nonzero entry of $v - u$ is positive. Stated colloquially,
this is the term order that makes $p_1$ so expensive its
degree dominates the term order. If two monomials
have the same degree in $p_1$, then we compare the degrees
of $p_2$, and so on. Generalizing the lexicographic
order are the elimination orders. These are obtained
by splitting the variables into a partition $A \cup B$. In the
elimination order $p^u \prec p^v$ if $p^v$ has larger degree in
the $A$ variables than $p^u$. If $p^v$ and $p^u$ have the same
degree in the $A$ variables, then some other term order
is used to break ties.
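
The two comparisons just described can be sketched on exponent vectors. This is a minimal illustration in Python, not code from the Singular library mentioned later; the function names are hypothetical, the block $A$ is assumed to be the first `k` variables, and the lexicographic order is assumed as the tie-breaker (other tie-breakers are equally valid).

```python
def lex_less(u, v):
    """True if p^u < p^v in lex order: leftmost nonzero entry of v - u is positive."""
    for a, b in zip(u, v):
        if a != b:
            return b > a
    return False  # equal monomials

def elim_less(u, v, k):
    """Elimination order with A = first k variables; ties broken by lex (assumed)."""
    deg_u, deg_v = sum(u[:k]), sum(v[:k])
    if deg_u != deg_v:
        return deg_u < deg_v
    return lex_less(u, v)

def leading_monomial(exponents, less):
    """Largest exponent vector of a polynomial under the given order."""
    lead = exponents[0]
    for e in exponents[1:]:
        if less(lead, e):
            lead = e
    return lead

# Monomials of s12*q - s13 in the variables (q, s12, s13), with A = {q}.
g = [(1, 1, 0), (0, 0, 1)]
lead_g = leading_monomial(g, lambda u, v: elim_less(u, v, 1))
```

Here `lead_g` is the exponent vector of $s_{12} q$, because any monomial containing $q$ outranks every monomial that does not. This is what forces polynomials free of the $A$ variables to be detected by their leading terms.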

Computation of Gröbner bases is via Buchberger's algorithm.
A key ingredient to this algorithm is the division
algorithm of multivariate polynomials. Once a
term ordering is fixed, the division algorithm of multivariate
polynomials consists in canceling leading terms
until no term in the remainder can be divided by the
leading term of the divisor, much in the same way as
the familiar division algorithm for univariate polynomials.
The following proposition provides an algebraic
procedure for the identification of parameters.

Proposition 4. Let $\prec$ be an elimination term order
with respect to the partition $\{q\} \cup \{p_1, \ldots, p_n\}$. Let
$\mathcal{G} = \{g_1, \ldots, g_n\}$ be a reduced Gröbner basis for $I(\tilde f(\Theta))$
with respect to the term order $\prec$. The Gröbner basis
$\mathcal{G}$ contains polynomials of the lowest nonzero degree in $q$
of the form from Proposition 3, if such a polynomial
exists. In particular, if no polynomial of $\mathcal{G}$ contains the
indeterminate $q$, then $s$ is not generically identifiable.

Proof. Let $\mathcal{G} = \{g_1, \ldots, g_n\}$ be a reduced Gröbner basis
for $I(\tilde f(\Theta))$ with respect to the elimination order
$\prec$. By virtue of being an elimination order, this set
also contains a reduced Gröbner basis for $I(f(\Theta))$.
Let $\mathcal{G}'$ denote this reduced Gröbner basis. Note that
$\mathcal{G}' = \mathcal{G} \cap \mathbb{R}[p]$ by properties of elimination orders.

Now, let $g$ be a polynomial satisfying the conditions
of Proposition 3 of the lowest nonzero degree in $q$. We
can apply the division algorithm by $\mathcal{G}'$ to $g$ to get a
remainder $\bar g$. Since the leading coefficient of $g$ in $q$ is
not in $I(f(\Theta))$, this leading coefficient does not reduce
to zero by division by $\mathcal{G}'$. Thus, $\bar g$ has the same degree
as $g$ in $q$. Now $\mathrm{in}_\prec(\bar g) \in \mathrm{in}_\prec(I(\tilde f(\Theta)))$, thus $\mathrm{in}_\prec(\bar g)$ is
divisible by some leading term of some polynomial in
$\mathcal{G}$. But by our reduction assumption it is not divisible
by the leading monomial of any element of $\mathcal{G}'$. Hence,
it must be divisible by the leading term of some element
$h \in \mathcal{G}$ whose leading term has a nonzero power of $q$ in it.
Since $g$ has the lowest possible degree in $q$, then so does
$\bar g$, and so must $h$, to divide the leading term of $\bar g$.

On the other hand, if no such polynomial exists, then
there could not be any $q$ appearing in any elements of
the reduced Gröbner basis of $I(\tilde f(\Theta))$. □

Example 5. The following Macaulay2 code computes
the unique lowest degree polynomial $g(q, p)$
for the problem of determining the identification
of the parameter $\lambda_{23}$. Instead of using indeterminates
$p_1, \ldots, p_6$, we use s11,...,s33 to match the
$f_{11}, \ldots, f_{33}$ in the parametrization.

    -- parameter ring: entries of Omega and Lambda
    S = QQ[w11,w22,w23,w33,l12,l23];
    -- q stands for the parameter of interest; Eliminate 1 makes q expensive
    R = QQ[q,s11,s12,s13,s22,s23,s33,
           MonomialOrder => Eliminate 1];
    -- q maps to l23 and each sij maps to fij from Example 1
    f = map(S,R,matrix{{
        l23,
        w11,
        w11*l12,
        w11*l12*l23,
        w22 + w11*l12^2,
        w22*l23 + w11*l12^2*l23 + w23,
        w33 + w22*l23^2 + 2*w23*l23
            + w11*l12^2*l23^2}});
    -- the kernel is the vanishing ideal of the image
    kernel f

The output is the Gröbner basis of the ideal $I(\tilde f(\Theta))$,
which consists of a single polynomial $s_{12} q - s_{13}$.

3 GAUSSIAN STRUCTURAL 
EQUATION MODELS 

Let $G = (V, D, B)$ be a graph with vertex set $V$, a
set of directed edges $D$, and a set of bidirected edges
$B$. We assume that $V = \{1, 2, \ldots, m\}$ and that the
subgraph of directed edges is acyclic and topologically
ordered (that is, $i \to j \in D$ implies that $i < j$). Let
$PD_m$ denote the set of $m \times m$ symmetric positive definite
matrices. Let $PD(B) := \{\Omega \in PD_m : \omega_{ij} = 0$
if $i \neq j$ and $i \leftrightarrow j \notin B\}$.

The Gaussian structural equation model for the graph $G$ is
a set of linear relationships between random variables
$X_i$, $i \in V$, induced by the graph and starting with correlated
noise terms. In particular, let $\epsilon$ be a centered
$m$-dimensional jointly normal random vector $\epsilon \sim N(0, \Omega)$
such that $\Omega \in PD(B)$. For each $i \to j \in D$ let $\lambda_{ij} \in \mathbb{R}$
be a parameter. For each $j \in V$ define

$$X_j = \sum_{i : i \to j \in D} \lambda_{ij} X_i + \epsilon_j .$$

The random vector $X$ has a jointly normal distribution
with $X \sim N(0, \Sigma)$ where

$$\Sigma = (I - \Lambda)^{-T} \, \Omega \, (I - \Lambda)^{-1},$$

where $\Lambda$ is the strictly upper triangular matrix with
$\Lambda_{ij} = \lambda_{ij}$ if $i \to j \in D$ and $\Lambda_{ij} = 0$ otherwise.

Said in the language of statistical models and mappings
between parameter spaces in the previous section,
the Gaussian structural equation model is a
map that associates to a parameter vector
$(\Lambda, \Omega) \in \mathbb{R}^{D} \times PD(B)$ the normal distribution $N(0, \Sigma)$.
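
To make the parametrization concrete, the sketch below evaluates $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$ for the instrumental variable model of Example 1 ($1 \to 2 \to 3$ with $2 \leftrightarrow 3$). It is an illustrative Python sketch using exact rationals; the helper names and the numerical parameter values are assumptions, and the inverse is computed from the finite series $I + \Lambda + \cdots + \Lambda^{m-1}$, which is valid because $\Lambda$ is strictly upper triangular.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse_I_minus(L):
    """(I - L)^{-1} = I + L + ... + L^{m-1} for strictly upper triangular L."""
    m = len(L)
    identity = [[Fr(int(i == j)) for j in range(m)] for i in range(m)]
    total, power = identity, identity
    for _ in range(m - 1):
        power = matmul(power, L)
        total = [[total[i][j] + power[i][j] for j in range(m)] for i in range(m)]
    return total

def covariance(L, Omega):
    """Sigma = (I - L)^{-T} Omega (I - L)^{-1}."""
    T = inverse_I_minus(L)
    return matmul(matmul(transpose(T), Omega), T)

# Hypothetical parameter values for the graph 1 -> 2 -> 3 with 2 <-> 3.
l12, l23 = Fr(3, 2), Fr(-5, 7)
w11, w22, w23, w33 = Fr(2), Fr(1), Fr(1, 3), Fr(1)
Lam = [[0, l12, 0], [0, 0, l23], [0, 0, 0]]
Omega = [[w11, 0, 0], [0, w22, w23], [0, w23, w33]]
Sigma = covariance(Lam, Omega)
```

With these values `Sigma[0][2] / Sigma[0][1]` equals `l23`, which is the instrumental variable formula $\lambda_{23} = \sigma_{13}/\sigma_{12}$.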

The parameters of most frequent interest in structural
equation models are the entries of $\Lambda$ and the entries
of $(I - \Lambda)^{-1}$. The parameter $\lambda_{ij}$ is called the direct
causal effect of $X_i$ on $X_j$. The parameter $(I - \Lambda)^{-1}_{ij}$
is called the total effect of $X_i$ on $X_j$. Parameter identification
in these structural equation models asks for
formulas to recover the direct and total effects given
the covariance matrix $\Sigma$.

Note that by basic properties of algebraic graph theory,
the entries in $(I - \Lambda)^{-1}$ have a combinatorial
interpretation in terms of directed paths in the graph $G$.
A directed path from $i$ to $j$ is a sequence of directed
edges $i \to i_1 \to i_2 \to \cdots \to j$. The set of all paths
from $i$ to $j$ is denoted $\mathcal{P}(i, j)$. Then

$$(I - \Lambda)^{-1}_{ij} = \sum_{P \in \mathcal{P}(i,j)} \ \prod_{k \to l \in P} \lambda_{kl}.$$
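
The path formula can be evaluated directly by enumerating directed paths. The following Python sketch is illustrative only; the function names are hypothetical and the edge set $2 \to 3$, $3 \to 4$, $2 \to 4$ with its coefficients is an assumed toy example.

```python
from fractions import Fraction as Fr

def directed_paths(edges, i, j):
    """Yield every directed path from i to j as a list of edges (k, l)."""
    if i == j:
        yield []  # the empty path accounts for the diagonal entries
        return
    for (k, l) in edges:
        if k == i:
            for rest in directed_paths(edges, l, j):
                yield [(k, l)] + rest

def total_effect(lam, i, j):
    """Sum over directed paths from i to j of the product of edge coefficients."""
    total = Fr(0)
    for path in directed_paths(list(lam), i, j):
        weight = Fr(1)
        for edge in path:
            weight *= lam[edge]
        total += weight
    return total

# Hypothetical coefficients on the edges 2 -> 3, 3 -> 4, 2 -> 4.
lam = {(2, 3): Fr(1, 2), (3, 4): Fr(3), (2, 4): Fr(-1, 4)}
effect_24 = total_effect(lam, 2, 4)  # lambda23*lambda34 + lambda24
```

The recursion terminates because the directed part of the graph is acyclic. For this edge set the total effect of $X_2$ on $X_4$ is the polynomial $\lambda_{23}\lambda_{34} + \lambda_{24}$, with one monomial per directed path.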



Another type of parameter of occasional interest are
the path specific effects, which come from the monomials
$\prod_{k \to l \in P} \lambda_{kl}$ for $P \in \mathcal{P}(i, j)$.



If all the $\lambda_{ij}$ parameters are (generically) identifiable,
then so too are the entries of $\Omega$. Indeed, once one can
determine all the $\lambda_{ij}$ we can simply determine $\Omega$ by

$$\Omega = (I - \Lambda)^{T} \, \Sigma \, (I - \Lambda). \qquad (1)$$
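
For completeness, Equation (1) follows in one step from the factorization $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$ of the covariance matrix. The short derivation below is added for illustration and uses only that $I - \Lambda$ is invertible, which holds because $\Lambda$ is strictly upper triangular:

```latex
\begin{align*}
\Sigma &= (I-\Lambda)^{-T}\,\Omega\,(I-\Lambda)^{-1}, \\
(I-\Lambda)^{T}\,\Sigma\,(I-\Lambda)
  &= (I-\Lambda)^{T}(I-\Lambda)^{-T}\,\Omega\,(I-\Lambda)^{-1}(I-\Lambda)
   = \Omega .
\end{align*}
```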



This fact provides another reason why much of the 
attention is focused on only parameters that involve 
A when studying identifiability problems for Gaussian 
graphical models. 

Note that since both the direct effects and total effects 
are polynomial, we can employ the techniques from the 
previous section to decide whether or not parameters 
in the model are identifiable. 

Example 6. Consider the graph $G$ in Figure 1.

Figure 1: Instrumental Variable

This model is traditionally called an instrumental variables
model (random variable $X_1$ is the instrument).
The formulas for $\sigma_{ij}$ in terms of $\Lambda$ and $\Omega$, obtained
from the factorization $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$, are
simply the $f_{ij}$ from Example 1. This model is generically
identifiable but not identifiable, since it is not
possible to identify $\lambda_{23}$ when $\lambda_{12} = 0$.

Example 7. Consider the graph $G$ in Figure 2.

Figure 2: A 2-identified SEM

This example is originally due to Brito and it shows that the
"G-criterion" described in (Brito and Pearl 2002b)
is not complete. Using the ideas from the previous
section, we compute the elimination polynomials to
deduce that every parameter except $\omega_{11}$ is the solution
to a quadratic polynomial with coefficients determined
by $\Sigma$. For example, $\lambda_{23}$ is the solution to the quadratic
equation:

$$(\ldots\,\sigma_{13}\sigma_{22}\sigma_{24}\,\ldots)\,q^2
+ (\sigma_{13}\sigma_{23}\sigma_{24} - \sigma_{14}\sigma_{22}\sigma_{33} - \sigma_{14}\sigma_{23}^2
+ \sigma_{12}\sigma_{24}\sigma_{33} + \sigma_{13}\sigma_{22}\sigma_{34} - \sigma_{12}\sigma_{23}\sigma_{34})\,q
+ (\sigma_{14}\sigma_{23}\sigma_{33} - \sigma_{13}\sigma_{24}\sigma_{33}) = 0.$$

Consider the polynomials for $\lambda_{ij}$. Since we know there
is one real solution for $\Sigma \in f(\Theta)$, both solutions for $\lambda_{ij}$
are real. This means that there are exactly two real
$\Lambda$ matrices compatible with a given $\Sigma$. Solving
for $\Omega$ as

$$\Omega = (I - \Lambda)^{T} \, \Sigma \, (I - \Lambda)$$

with both real choices of $\Lambda$ gives two real possibilities
for $\Omega$, which hence must be the real roots of the identification
polynomials for $\Omega$. Since $\Sigma$ is positive definite,
so must be $\Omega$ in both these cases. This implies that
in this case $f$ is a generically 2-to-1 map, that is, the
model is 2-identified.

4 COMPUTATIONAL RESULTS



In this section we present our computational results on
the identification of all Gaussian structural equation
models on three and four random variables. Using the
ideas presented in Section 2 we computed all generically
identifiable parameters for each of the $2^6 = 64$
models on three variables and each of the $2^{12} = 4096$
models on four variables. The parameters identified
are the direct causal effects (the entries of $\Lambda$), the total
effects (the entries of $(I - \Lambda)^{-1}$), the path specific
effects, and the entries of $\Omega$. The results of these computations
are displayed on our project website:

http://graphicalmodels.info/

The website contains all identifiability results in formatted
text. It also displays colored pictures encoding
identifiability of parameters in $\Lambda$ and $\Omega$. The parameter
$\omega_{ii}$ in $\Omega$ is represented by the node labeled $X_i$ in
the colored graph. An edge or a node is colored green
if the associated parameter is generically identified; it
is colored blue if it is algebraically $k$-identifiable with
$k \geq 2$; otherwise it is colored red. On a black and
white print-out, a green edge is recognized by a circle
surrounding the label of the corresponding parameter,
a blue edge by an ellipse and a red edge by a rectangle.
For example the graphical model in Figure 3 has
generically identified direct effect $\lambda_{23}$ and generically
identified $\Omega$ parameters $\omega_{11}$, $\omega_{24}$ and $\omega_{33}$.



Figure 3: Four variable SEM model with total effect
of $X_2$ on $X_4$ generically identifiable.

The colored graph does not encode total effects or
path-specific effects, but in this example, the total effect
of $X_2$ on $X_4$, given by the polynomial $\lambda_{23}\lambda_{34} + \lambda_{24}$,
is generically identified as the solution to the equation

but does not satisfy the back-door criterion. No other
total or path-specific effect in this graph is identified.

Besides the inherent usefulness of solving the (generic)
identifiability problem for all models on three and
four variables, our main motivation for creating this
database is that it can be used to test the efficacy and
correctness of current and future graphical criteria for
identifiability. For this reason, we have also computed
which direct causal effects are identified by the single-door
criterion or instrumental variables, and which total
effects are identified by the back-door criterion.

We have developed a Singular (Greuel et al. 2009) library
to perform all the previous computations. The
library requires the latest version of Singular. Its
graphing capabilities require some special LaTeX packages
and a Mac OS X environment. The library, its
documentation and installation instructions can be
found on the website. Currently we are also porting
this library to Macaulay2 (Grayson and Stillman).
Future plans include extending this library to include
more graphical criteria.

In the next subsections, we summarize the results of
these computations for 3 and 4 random variables.

4.1 THREE RANDOM VARIABLES

Theorem 8. Of the 64 graphs on three vertices,

(i) there are exactly 31 graphs that are generically
identifiable and 33 graphs that are not generically
identifiable.

(ii) The single-door criterion and instrumental variables
form a complete method to generically identify
direct causal effects for SEM models on three
variables.

Proof. There are 27 bow-free models on three variables,
that is, models satisfying the condition that the
errors for variables $i$ and $j$ are uncorrelated if variable
$i$ occurs in the structural equation for variable $j$.
Brito (2004) shows that every bow-free model is
generically identified. Our computations show that all
direct causal effect parameters in a bow-free model on
three variables are generically identified by the single-door
criterion.

Table 1 lists the four remaining generically identifiable
models. Each of these graphs has exactly one direct
causal effect parameter which is identified by an instrumental
variable but not by the single-door criterion.

Table 1: SEMs on three variables with one generically
identified direct causal effect by an instrumental variable.

Directed edges    Bidirected edges

The remaining 33 graphs are not generically identifiable.
Nevertheless, 13 of these non-identifiable graphs
contain at least one identified direct causal effect. □

The previous theorem describes the identification of
direct causal effects. Nevertheless, if a model is not
identifiable, Equation (1) cannot be used to identify the
parameters in $\Omega$. To our knowledge, there are no
graphical methods to identify these parameters. So
even for this simple model our algebraic approach provides
new insights. For example, in the model with
directed edges $1 \to 2$, $2 \to 3$ and bidirected edges
$1 \leftrightarrow 2$, $2 \leftrightarrow 3$, the parameter $\lambda_{23}$ is generically identified
by an instrumental variable but $\lambda_{12}$ is not generically
identified. The parameter $\omega_{11}$ is identifiable and
$\omega_{23}$ and $\omega_{33}$ are generically identified. The parameter
$\omega_{23}$ is generically identifiable by the formula:

$$\omega_{23} = \frac{\sigma_{12}\sigma_{23} - \sigma_{13}\sigma_{22}}{\sigma_{12}}.$$
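
One way to see where this formula comes from is the short derivation below. It is added here for illustration under the model assumptions of Section 3 and is not the output of the original computations. In this graph $X_1 = \epsilon_1$, $X_3 = \lambda_{23} X_2 + \epsilon_3$, and there is no bidirected edge $1 \leftrightarrow 3$:

```latex
\begin{align*}
\sigma_{13} &= \operatorname{Cov}(X_1,\ \lambda_{23} X_2 + \epsilon_3)
             = \lambda_{23}\,\sigma_{12}
             && (\omega_{13} = 0), \\
\sigma_{23} &= \operatorname{Cov}(X_2,\ \lambda_{23} X_2 + \epsilon_3)
             = \lambda_{23}\,\sigma_{22} + \omega_{23}
             && (\operatorname{Cov}(X_2, \epsilon_3) = \omega_{23}), \\
\omega_{23} &= \sigma_{23} - \frac{\sigma_{13}}{\sigma_{12}}\,\sigma_{22}
             = \frac{\sigma_{12}\sigma_{23} - \sigma_{13}\sigma_{22}}{\sigma_{12}} .
\end{align*}
```

The division by $\sigma_{12}$ is what makes the identification generic rather than global: the formula fails on the measure-zero set where $\sigma_{12} = 0$.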



4.2 FOUR RANDOM VARIABLES 

The case of models on four random variables gets significantly
harder. First of all, there are $2^{12} = 4096$
graphs, and some of them are not generically identifiable
but 2-identifiable. In these cases, no graphical criterion
can identify these parameters. Example 7 exhibits
this behaviour. Moreover, even when the models are
generically identifiable or just some subset of parameters
are generically identified, the featured graphical
criteria are no longer complete, i.e., there are several
SEM models on four variables where the algebraic
method is the only tested approach capable of
(generically) identifying certain parameters. Table 2
lists some examples with this behaviour. It remains
to be seen whether the same statement holds when we
include even more of the existing graphical criteria.



Table 2: Three SEMs on four variables with two generically
identified parameters via algebraic methods.

Directed edges                  Bidirected edges
1 → 3, 2 → 4, 3 → 4             1 ↔ 2, 1 ↔ 3, 1 ↔ 4
1 → 2, 2 → 3, 2 → 4             1 ↔ 2, 1 ↔ 3, 2 ↔ 4
1 → 2, 1 → 3, 2 → 3, 3 → 4      2 ↔ 3, 2 ↔ 4

While most of the models took seconds to be computed,
there were a handful of graphs (< 100) that
took weeks or even months to be identified. While
large numbers of edges and bows are expected in graphs
with this anomalous behavior, no other apparent combinatorial
description was easily identified. For example,
in the model with directed edges $1 \to 2$, $1 \to 4$,
$2 \to 3$, $3 \to 4$ and bidirected edges $1 \leftrightarrow 2$, $1 \leftrightarrow 3$,
$1 \leftrightarrow 4$ the computation to show that $\omega_{44}$ is not
identifiable took more than 75 days.

The following theorem summarizes our findings. 

Theorem 9. Of the 4096 graphs on four variables

(i) exactly 1246 are generically identifiable, 6 are algebraically
2-identified, and 2844 are not generically
identifiable.

(ii) Of the 1246 generically identifiable models, exactly
1093 are generically identified by the single-door
and instrumental variables criteria and the
remaining 153 generically identified models contain
direct causal effect parameters only identified
by the algebraic method.

(iii) There are exactly 729 bow-free models, each generically
identified by the single-door criterion.
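
The model counts quoted in this section can be reproduced by brute-force enumeration. The sketch below is an illustrative Python check, not part of the Singular library; the function name is hypothetical. It assumes, as in Section 3, that vertices are topologically ordered, so that each pair $i < j$ independently may or may not carry a directed edge $i \to j$ and a bidirected edge $i \leftrightarrow j$, and it treats a graph as bow-free when no pair carries both.

```python
from itertools import combinations, product

def count_models(m):
    """Return (number of mixed graphs, number of bow-free graphs) on m vertices."""
    pairs = list(combinations(range(1, m + 1), 2))
    # Per pair: (directed edge present?, bidirected edge present?)
    options = [(d, b) for d in (False, True) for b in (False, True)]
    total = bow_free = 0
    for choice in product(options, repeat=len(pairs)):
        total += 1
        # A bow is a pair i < j with both i -> j and i <-> j present.
        if not any(d and b for d, b in choice):
            bow_free += 1
    return total, bow_free

three = count_models(3)
four = count_models(4)
```

Under these assumptions `three` is `(64, 27)` and `four` is `(4096, 729)`, matching the totals $2^6$ and $2^{12}$ and the bow-free counts stated in Theorems 8 and 9.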

Table 3 lists the 6 algebraically 2-identified SEM models
on four variables and the time (in seconds) to perform
all computations.



Table 3: Algebraically 2-identified SEMs on four variables.

Directed edges    Bidirected edges         Time
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      4.5
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      0.7
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      0.6
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      1.1
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      0.9
                  1 ↔ 2, 1 ↔ 3, 1 ↔ 4      0.3

The first SEM model in Table 3 corresponds to Example 7.
Each of the remaining five models exhibits a
similar behavior: $\omega_{11}$ is identified and all direct causal
effects are the solution to a quadratic polynomial with
coefficients determined by $\Sigma$. Nonetheless, in the second
model the parameters $\omega_{33}$ and $\omega_{44}$ are generically
identifiable and in the last model the parameters
$\omega_{22}$, $\omega_{33}$ and $\omega_{44}$ are generically identifiable.

4.3 CONSTRAINT SETS 

As described in Section 2, the identification of a particular
parameter in a model parametrized by $f$ requires
the computation of the vanishing ideal $I(f(\Theta))$,
or constraint set. Our website also displays the results
of these vanishing ideal computations, and determines
when the ideals $I(f(\Theta))$ are generated by determinantal
constraints (generalizations of conditional independence
constraints), using trek separation (Sullivant et
al. 2008).



Theorem 10. The vanishing ideal of any structural
equation model on three variables is determinantal. Of
the 4096 structural equation models on four random
variables, the vanishing ideals of exactly 33 are not
determinantal.

5 DISCUSSION 

We have described a framework using computational 
algebra for determining whether or not parameters 
are identifiable in statistical models. We used our 
framework to provide the first systematic study of the 
identifiability problem for structural equation models. 
We have displayed the results of our computations for 
graphs with three or four random variables on a search- 
able website. Developing a general characterization of 
which parameters for which graphs are, in fact, identi- 
fiable remains a major open problem in the theoretical 
study of structural equation models. 



One observation arising from our large scale computational
study is that two graphs which combinatorially
seem very similar might have drastically different running
times when it comes to verifying if the parameters
in the model are identifiable. The longest computations
seemed to occur in the proofs of nonidentifiability
of some of the parameters. This phenomenon deserves
more careful study. In particular, we need to address
the question of whether or not this has to do with our
implementation (for example, if changing the term order
might speed up computations) or if, for some graphs,
identifiability is simply intrinsically more difficult to
prove or disprove.